New insights into the evolutionary history
of biological nitrogen fixation 

Eric S. Boyd and John W. Peters * 

Department of Chemistry and Biochemistry and Department of Microbiology, Montana State University, Bozeman,
MT, USA

Nitrogenase, which catalyzes the ATP-dependent reduction of dinitrogen (N2) to ammonia 
(NH3), accounts for roughly half of the bioavailable nitrogen supporting extant life. The 
fundamental requirement for fixed forms of nitrogen for life on Earth, both at present 
and in the past, has led to broad and significant interest in the origin and evolution 
of biological N2 fixation. One key question is whether the limited availability of fixed 
nitrogen was a factor in life's origin or whether there were ample sources of fixed 
nitrogen produced by abiotic processes or delivered through the weathering of bolide 
impact materials to support this early life. If the latter, the key questions become what
characteristics of the environment precipitated the evolution of this oxygen-sensitive
process, when this occurred, and how its subsequent evolutionary history was
impacted by the advent of oxygenic photosynthesis and the rise of oxygen in the Earth's
biosphere. Since the availability of fixed sources of nitrogen capable of supporting early life
is difficult to glean from the geologic record, there are limited means to get direct insights 
into these questions. Indirect insights, however, can be gained through phylogenetic 
studies of nitrogenase structural gene products and additional gene products involved 
in the biosynthesis of the complex metal-containing prosthetic groups associated with 
this enzyme complex. Insights gained from such studies, as reviewed herein, challenge 
traditional models for the evolution of biological nitrogen fixation and provide the basis for 
the development of new conceptual models that explain the stepwise evolution of this 
highly complex, life-sustaining process.
INTRODUCTION

All life requires fixed sources of nitrogen (N),
and their availability often limits productivity
in natural systems (Falkowski, 1997).
Most N on Earth is in the form of dinitro- 
gen (N2), which is not bio-available. On early 
Earth, fixed sources of N may have been sup- 
plied by abiotic processes such as electrical 
(i.e., lightning) based oxidation of N2 to nitric 
oxide (NO) (Yung and McElroy, 1979; Kasting 
and Walker, 1981) or mineral (e.g., ferrous
sulfide) based reduction of N2 (Schoonen and
Xu, 2001; Summers et al., 2012), nitrous oxide
(Summers et al., 2012), or nitrite (NO2−)/nitrate
(NO3−) (Summers, 2005; Singireddy et al., 2012)
to NH3. Abiotic sources of fixed N (e.g.,
NO, NO2−, NO3−, NH3) are thought to have
become limiting to an expanding global biome 
(Kasting and Siefert, 2001; Navarro-Gonzalez 
et al., 2001), which may have precipitated 
the innovation of biological mechanisms to 
reduce N2. 



Frontiers in Microbiology | www.frontiersin.org | August 2013 | Volume 4 | Article 201
Boyd and Peters | Natural history of biological nitrogen fixation

The primary enzyme that catalyzes the 
reduction of N2 to bio-available NH3 today 
is the molybdenum (Mo)-dependent nitrogenase
(Nif), although other phylogenetically
related forms of nitrogenase that differ in their
active site metal composition (termed alternative
nitrogenases, or Vnf and Anf) may also contribute
NH3 in environments that are limiting
in Mo (Joerger and Bishop, 1988; Kessler et al., 
1997). Nitrogenase catalyzes the production of 
half, if not more, of all of the fixed nitrogen 
on Earth today (Falkowski, 1997). As such, this 
process functions to relieve fixed N limitation 
in natural ecosystems (Zehr et al., 2003) and is 
likely to have a disproportionate effect on the 
functioning of an ecosystem, relative to inputs 
from other populations. Thus, organisms which 
fix nitrogen in natural communities have been 
described as keystone species (Hamilton et al., 
2011a). 

TAXONOMY, PHYLOGENY, AND 
PHYSIOLOGY OF ORGANISMS THAT FIX 
DINITROGEN (N2)

The taxonomic distribution of nitrogenase is
curiously restricted to bacteria and archaea,
with no known examples of the genes encoding
this process occurring within the eukarya
(Raymond et al., 2004; Boyd et al., 2011a; Dos
Santos et al., 2012). Within the archaea, nitrogenase
has a narrow distribution and is restricted
to methanogens (Euryarchaeota) within the
orders Methanococcales, Methanobacteriales, and
Methanosarcinales, and has yet to be identified
among members of the Crenarchaeota,
Thaumarchaeota, or Nanoarchaeota. Likewise,
nif exhibits a limited distribution among
bacteria. For example, nif has been identified
in a number of aerobic soil bacteria and
in 21 of the 44 sequenced cyanobacterial genomes,
including those of strains that inhabit terrestrial (e.g.,
Cyanothece and Synechococcus strains) and
marine (Crocosphaera watsonii) environments.
In addition, nif gene clusters are com- 
monly detected in the genomes of Firmicutes, 
Chloroflexi, Chlorobi, and Bacteroidetes and 
in several lineages of Actinobacteria and 
Proteobacteria. 

N2 fixation is associated with a diversity 
of microorganisms that display a wide variety 
of physiologies that range from obligate aer- 
obes to obligate anaerobes (Raymond et al., 
2004; Boyd et al., 2011a; Dos Santos et al.,
2012). Since nitrogenase is very sensitive to 
oxygen (Gallon, 1981), different classes of aer- 
obic or facultative anaerobic organisms have 



evolved a number of mechanisms to perform 
N2 fixation in an otherwise oxic environment. 
These mechanisms will only be treated briefly 
in this article and the reader is referred to 
extensive reviews written previously that focus 
on this topic (Gallon, 1981; Berman-Frank 
et al., 2003). Probably the most recognized
mechanism for fixing N2 in an oxic environ- 
ment is associated with symbiotic nitrogen fix- 
ation in which plants provide a microaerobic 
niche where oxygen tensions are maintained 
at low levels by a high affinity oxygen bind- 
ing protein known as leghemoglobin, which 
is produced by the host plant (Ott et al., 
2005). This strategy of O2 sequestration allows 
the symbiotic diazotroph (e.g., Rhizobia) to 
maintain aerobic respiration while catalyzing 
O2 sensitive N2 fixation. In addition, nitro- 
gen fixation occurs under anoxic conditions 
in strict anaerobes and only during periods 
of anaerobic growth in facultative anaerobes. 
Cyanobacteria, the only diazotrophic lineage
that produces molecular O2 as a product of
its metabolism, have developed a number of
mechanisms to fix N2 despite this O2 production (Fay, 1992; Berman-Frank
et al., 2003). For example, non-filamentous 
cyanobacteria tend to operate on a diurnal 
cycle where N2 fixation is up-regulated at night 
when oxygen tensions have dropped due to 
concomitant decreases in the production of 
photosynthetic O2 and increased O2 consump- 
tion by co-inhabiting heterotrophic popula- 
tions. Alternatively, the co-occurrence of N2 
fixation and O2 production in filamentous 
cyanobacteria is made possible by spatial seg- 
regation of nitrogenase in anaerobic heterocyst 
structures where increased protection of the 
nitrogenase complex is achieved through the 
photoreduction of O2 to H2O in photosystem 
I (Milligan et al., 2007), also known as the 
Mehler reaction (Mehler, 1957). In contrast, in 
obligate aerobes the nitrogen fixation appara- 
tus is protected by what has been described as 
a cytochrome-dependent respiratory protection 
mechanism whereby high rates of respiration 
ensure the consumption of oxygen at the cell 
membrane thereby maintaining low intracellu- 
lar oxygen tensions (Poole and Hill, 1997). It 
is likely that these mechanisms emerged later 
in the evolutionary history of biological nitro- 
gen fixation due to the increased complexity 
of nif gene clusters associated with microor- 
ganisms adapted to fixing nitrogen in an oxy- 
genated atmosphere (Boyd et al., 2011a,b; Dos 
Santos et al., 2012). The simplest assemblages 
of specific genes associated with nitrogen fix- 
ation occur in strict anaerobes. Nevertheless, 
tracing the evolutionary trajectory of this process
and identifying the most ancient nitrogen
fixers present in extant biology has been a chal- 
lenge. 

HOW ANCIENT IS BIOLOGICAL NITROGEN 
FIXATION? 

Biological nitrogen fixation has been suggested 
to be an ancient and perhaps even primor- 
dial process (Falkowski, 1997; Fani et al., 2000). 
This prevailing view is based on simulations of 
Archaean atmospheric chemistry that contend 
that decreasing CO2 concentrations and con- 
comitant decreases in abiotic N2 oxidation to 
NO led to a nitrogen crisis at ~3.5 Ga (Kasting
and Siefert, 2001). However, using the same
logic, Navarro-Gonzalez et al. argue that the nitrogen
crisis could have ensued much later, even as
late as 2.2 Ga (Navarro-Gonzalez et al., 2001).
Abiotic sources of nitrogen produced through 
mechanisms such as lightning discharge or min- 
eral based catalysis (Yung and McElroy, 1979; 
Schoonen and Xu, 2001) are thought to have 
become limiting to an expanding global biome. 
Since extant nitrogenase functions to relieve 
N limitation in ecosystems (Zehr et al., 2003; 
Rubio and Ludden, 2008), the imbalance in 
the supply and demand for fixed N is thought 
to have represented a strong selective pressure 
that may have precipitated the emergence of 
nitrogen fixation (Raymond et al., 2004; Boyd 
et al., 2011a). Little direct evidence exists, how- 
ever, with respect to the availability of ammo- 
nia or other reduced forms of nitrogen over 
the course of geological time, although several
recent isotopic analyses of shale kerogens
have suggested a supply of ammonia ample enough
to support nitrifying populations in the late
Archean, >2.5 Ga (Garvin et al., 2009; Godfrey
and Falkowski, 2009). 



While the geologic record cannot yet defini- 
tively reconcile when fixed sources of nitrogen 
became limiting, one can ask the general ques- 
tion of whether the overall distribution and 
phylogenetic history of nitrogenase and its asso- 
ciated functionalities in extant biology are con- 
sistent with a primordial process or a property of 
the Last Universal Common Ancestor (LUCA). 
Although widely distributed among bacteria, 
the distribution of the process is far from univer- 
sal among archaea, and as previously mentioned 
has never been identified among members of the 
eukarya (Boyd et al., 2011a; Dos Santos et al., 
2012). Moreover, unlike processes and function- 
alities that we ascribe to properties of LUCA, 
nitrogenase is not generally (note caveat below) 
associated with deeply rooted lineages identified 
by 16S ribosomal RNA evolutionary trajectories 
(Figure 1). 

Our recent screening of two representative 
Aquificales genomes [i.e., Thermocrinis albus 
(Wirth et al., 2010) and Hydrogenobacter 
thermophilus (Zeytun et al., 2011)] reveals
the presence of nitrogenase gene clusters. 
The identification of nif gene clusters in the 
genomes of thermophilic members of the 
Aquificales, regarded by many as the most 
deeply rooted bacterial lineage (Reysenbach 
et al., 2005), prompted a re-analysis of the
distribution of nif on a depiction of the taxonomic
tree of life (Figure 1). Although this
analysis suggests that deeply rooted bacteria
encode nif (e.g., Aquificales), the limited
distribution of nif among deeply branching 
archaea (e.g., Thaumarchaeota, Nanoarchaeota 
lineages) and deeply branching bacteria 
(Thermus/Deinococcus) suggests that nif may 
have been subject to extensive gene loss/lateral 
gene transfer or was not a property of the Last 
Universal Common Ancestor (LUCA). If Nif 



Last Universal Common Ancestor 
(LUCA) 

LUCA is the last common ancestor of 
all extant life prior to the divergence of 
archaea and bacteria and is believed to 
date to ~3.5 to 3.8 billion years ago 
(bya). Extant proteins that can be 
mapped back to LUCA through 
phylogenetic analyses suggest that the 
functionalities encoded by these genes 
were properties of life prior to the 
divergence of archaea and bacteria. 
Thus, placing proteins as either present 
or absent in LUCA represents a key 
mechanism by which evolutionary 
biologists can place biochemical 
reactivities in evolutionary time.



[Figure 1: tree-of-life schematic. Lineage labels recoverable from the image (Bacteria): Proteobacteria, Actinobacteria, Firmicutes, Bacteroidetes, Deinococcus/Thermus, Thermotogae, Aquificae.]

FIGURE 1 | A schematic of a 3 domain taxonomic tree of life with lineages that include nitrogen fixing
organisms, as identified through genome screening for nifHDKENB, overlaid in blue.
Monophyletic/Paraphyletic

Phylogenetic reconstructions of genes 
or proteins that reveal lineages 
comprised of just archaea, bacteria, or 
eukarya, relative to the branching of 
proteins derived from taxa representing 
the other domains, are termed 
monophyletic. Alternatively, 
phylogenetic reconstructions of genes 
or proteins that reveal lineages 
comprised of a mixture of archaea, 
bacteria, or eukarya, are termed 
paraphyletic. Monophyletic 
relationships observed in phylogenetic 
reconstructions of archaeal and 
bacterial proteins are suggestive of an
origin prior to the divergence of these
two domains at LUCA whereas
paraphyletic (mixed branching order)
relationships of archaeal and bacterial
genes or proteins are suggestive of a role
of lateral gene transfer in the evolution 
of a given protein and may suggest an 
origin for a given biochemical process 
after the divergence of archaea and 
bacteria from LUCA. 
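
The monophyly test described in this box can be illustrated with a minimal Python sketch. It is not the authors' analysis: the nested-tuple tree encoding, the function names (`leaves`, `is_monophyletic`), and the sequence labels (`arch1`, `bact1`, and so on) are hypothetical and chosen only for illustration. A group is treated as monophyletic when some node of the rooted tree contains exactly that group's sequences and no others.

```python
# Toy rooted trees encoded as nested tuples; leaves are hypothetical labels.
def leaves(node):
    """Return the set of leaf labels found under a node."""
    if isinstance(node, str):
        return {node}
    result = set()
    for child in node:
        result |= leaves(child)
    return result


def is_monophyletic(tree, group):
    """True if some node's leaf set equals `group` exactly."""
    group = set(group)
    stack = [tree]
    while stack:
        node = stack.pop()
        if leaves(node) == group:
            return True
        if not isinstance(node, str):
            stack.extend(node)
    return False


archaea = {"arch1", "arch2"}
bacteria = {"bact1", "bact2"}

# Reciprocally monophyletic: each domain forms its own subtree.
reciprocal = (("arch1", "arch2"), ("bact1", "bact2"))
# Mixed branching order: an archaeal sequence groups with a bacterial one.
mixed = (("arch1", ("arch2", "bact1")), "bact2")

print(is_monophyletic(reciprocal, archaea), is_monophyletic(reciprocal, bacteria))
print(is_monophyletic(mixed, archaea), is_monophyletic(mixed, bacteria))
```

In the first toy tree both domains pass the test, the pattern expected for a protein inherited from LUCA; in the second neither does, the mixed branching order that this box associates with lateral gene transfer.
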

Lateral Gene Transfer (LGT)

LGT, or horizontal gene transfer, of 
genetic material is a predominant 
mechanism by which biochemical and 
metabolic evolution has occurred 
throughout geological time. A 
significant fraction of genes encoded by 
an organism are acquired through 
vertical transmission, or through the 
replication of chromosomal DNA and 
the direct acquisition of this material 
from a parental ancestor. LGT, on the 
other hand, represents a mechanism by 
which genetic material can be attained 
from organisms that often span 
phylogenetic boundaries. LGT can 
occur via the uptake of naked DNA 
(transformation), viral infection 
(transduction), or through direct 
transfer of DNA between two taxa 
(conjugation). The aforementioned 
mechanisms often require close spatial 
proximity of host and recipient cells, 
which implies an important role for 
ecology in dictating the distribution of 
taxa and subsequent LGT events. The 
latter is clearly demarcated in the 
evolutionary history of nitrogenase. In
particular, the phylogenetic
relationships noted between
nitrogenase genes associated with
thermophilic members of the
Aquificales and thermophilic members
of the Deferribacteres imply that the
acquisition of the ability to fix N2 in the 
Aquificales occurred in a high 
temperature environment where both 
ancestral populations were present. 



was a property of LUCA, then phylogenetic 
analyses of nif gene or protein sequences would
be expected to reveal reciprocally monophyletic
bacterial and archaeal lineages (e.g., subtrees
containing just archaeal homologs and bacterial
homologs joined at LUCA). However, our
previous maximum likelihood and Bayesian
phylogenetic analyses of a concatenation of
the structural proteins required for nitrogen
fixation (homologs of H, D, and K, described 
below) indicate that archaea are paraphyletic 
with respect to bacteria (Boyd et al., 2011a,b), 
suggesting that Nif emerged after the diver- 
gence of archaea and bacteria. Additionally, 
our current maximum likelihood analysis 
of a concatenated HDK protein alignment 



block (Figure 2) indicates that Nif proteins 
from deeply rooted thermophilic members 
of the Aquificales were acquired recently 
through a lateral gene transfer with a more 
recently evolved and thermophilic member 
of the bacterial phylum Deferribacteres (e.g., 
ancestor of Calditerrivibrio nitroreducens or 
Denitrovibrio acetiphilus) (Figure 2). This 
suggests that Aquificales acquired nif in the 
recent evolutionary past from an exchange with 
a bacterial partner in a thermal environment. In 
further support of this hypothesis, numerous 
Aquificales genera (e.g., Hydrogenobaculum) 
do not encode nif (Romano et al., 2013), 
despite branching more basal than Thermocrinis 
and Hydrogenobacter in 16S rRNA gene 



[Figure 2: phylogenetic tree image. Legend: Uncharacterized nitrogenase, Mo-nitrogenase, V-nitrogenase, Fe-nitrogenase. Outgroup: BchLNB sequences from Chlorobium limicola DSM 245 and Anabaena variabilis ATCC 29413.]

FIGURE 2 | Maximum-likelihood phylogenetic reconstruction of a concatenation of Nif/Anf/Vnf and
uncharacterized HDK protein sequences. Individual protein sequences were aligned, concatenated, and subjected
to evolutionary reconstruction as described previously (Boyd et al., 2011b). The metal composition of the nitrogenase
active site clusters is overlaid in blue (Nif), purple ("uncharacterized nitrogenase"), red (Vnf), and green (Anf).
Bootstrap values are indicated at the nodes. Concatenations of paralogous proteins involved in the synthesis of
chlorophyll/bacteriochlorophyll (Bch/ChlLNB) were used to root the phylogeny. The hash at the root was introduced to
conserve space.
Gene duplication

Gene duplication represents a primary 
mechanism by which molecular 
evolution and biochemical 
diversification can occur. These events 
are random and initially generate 
identical copies of genes that encode 
proteins with identical biochemical 
reactivities. Given that the biochemical 
functionalities of duplicated proteins 
are initially redundant, the duplicated 
genes are under different selective 
pressure and can evolve to generate 
novel functionalities that can lead to 
selectable and novel biochemical 
phenotypes. The nitrogenase/ 
protochlorophyllide reductase 
superfamily has undergone numerous 
gene duplication events that have 
ultimately led to protein products that 
are involved in two of the most 
important biochemical processes on 
earth: biological N2 fixation and 
photosynthesis. 



phylogenetic reconstructions (Eder and Huber, 
2002). Hydrogenobaculum spp. tend to populate 
acidic geothermal environments where NH4+
produced from magmatic degassing is in much
higher supply (Holloway et al., 2011), whereas
Thermocrinis and Hydrogenobacter tend to populate
circumneutral to alkaline environments
that are N limited (Reysenbach et al., 2005).
Thus, the recent diversification of Aquificales 
into N limited environments may have been 
facilitated by acquisition of nif. Together, these 
findings add to a growing body of evidence 
suggesting that lateral gene transfer has played a 
significant role in expanding the taxonomic and 
ecological distribution of N2 fixation (Raymond 
et al., 2004; Kechris et al., 2006; Bolhuis et al.,
2009). 

In spite of the uncertainty and controversy 
that surrounds molecular dating techniques 
based on phylogenetic reconstructions 
(Bromham and Penny, 2003; Rutschmann, 
2006), our recent data-driven attempt at 
addressing the age of nitrogen fixation using 
molecular dating techniques places its origin 
within a window of ~1.5-2.2 Ga (Boyd et al.,
2011a). This time frame corresponds to a
period of Earth history where inferred fixed
N levels are thought to have become limiting, 
O2 concentrations began to increase, and 
dissolved molybdenum (Mo) concentrations 
started to increase (Navarro-Gonzalez et al., 
2001; Anbar and Knoll, 2002; Berman-Frank 
et al., 2003; Canfield, 2005; Anbar et al., 2007;
Anbar, 2008). Many of the geochemical changes 
associated with this period of geological history 
are likely a consequence of the production of 
oxygen by proliferating populations of oxygenic 
phototrophs (Anbar and Knoll, 2002; Anbar, 
2008). The production of oxygen may have 
opened up new ecological niches, allowing 
the global biome to diversify and radiate into 
new environmental realms. This expansion of 
the biosphere would have created additional 
demand on the bioavailable N pool and may 
have increased the selective pressure to evolve 
a biological mechanism to increase the local 
bioavailable N pool. 

WHAT ARE THE MOST DEEPLY ROOTED EXTANT 
ORGANISMS THAT HARBOR NITROGENASE? 

To answer this question one must first define 
a set of criteria for the minimum number of 
genes that are required to catalyze N2 fixation
in extant organisms. This is not as simple
as it is for many other enzyme systems,
since the presence of genes that
encode the structural proteins alone is insufficient to
produce an active Mo-dependent nitrogenase.
Rather, a series of additional genes are required 
to synthesize the complex iron-molybdenum 
cofactor (FeMo-co) located at the active site of 
nitrogenase (Rubio and Ludden, 2008). From 
previous genomic, biochemical, and molecular 
genetic studies of different microbial sources, 
we and others have established a set of crite- 
ria for biological nitrogen fixation that requires 
at a minimum the structural genes nifH, nifD, 
and nifK and three additional FeMo-cofactor 
biosynthetic genes nifE, nifN, and nifB (Boyd 
et al., 2011a; Dos Santos et al., 2012). These criteria
are based primarily on deletion mutation
analyses of nifEN (Ugalde et al., 1984; Jacobson
et al., 1989; Roll et al., 1995; Hu et al., 2005) and
nifB (Shah et al., 1994; Christiansen et al., 1998),
which result in the production of an inactive,
FeMo-cofactor-less nitrogenase. Moreover,
nif clusters of all sequenced nitrogen fixers 
that have been characterized (note caveat with 
respect to "uncharacterized nitrogenase," as dis- 
cussed below) have at a minimum these six gene 
products (nifHDKENB) (Boyd et al., 2011a; Dos 
Santos et al., 2012). Using these criteria, we 
exploited specific genetic events involved in the 
evolution of these six proteins, with particular 
attention paid to those that are involved in the 
biosynthesis of the active site cluster, in order 
to identify which extant organism harbors the 
oldest nitrogenase. 
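
The six-gene criterion lends itself to a simple presence/absence check. The Python sketch below is illustrative only and is not the screening pipeline used in the cited studies: it assumes each genome has already been reduced to a list of annotated gene names, whereas real screens rely on homology searches of protein sequences. The genome identifiers and function names are hypothetical.

```python
# Minimum gene set named in the text: structural genes nifH, nifD, nifK and
# FeMo-cofactor biosynthetic genes nifE, nifN, nifB.
REQUIRED_NIF_GENES = frozenset({"nifH", "nifD", "nifK", "nifE", "nifN", "nifB"})


def missing_nif_genes(annotated_genes):
    """Return the required nif genes absent from an annotation, sorted."""
    return sorted(REQUIRED_NIF_GENES - set(annotated_genes))


def is_candidate_diazotroph(annotated_genes):
    """True only if all six required genes are present."""
    return not missing_nif_genes(annotated_genes)


# Hypothetical annotations, for illustration only.
genomes = {
    "genome_A": ["nifH", "nifD", "nifK", "nifE", "nifN", "nifB", "nifX"],
    "genome_B": ["nifH", "nifD", "nifK"],  # structural genes only
}

for name, genes in genomes.items():
    print(name, is_candidate_diazotroph(genes), missing_nif_genes(genes))
```

Under this criterion, a genome carrying only the structural genes (like the hypothetical `genome_B`) is not scored as a nitrogen fixer, mirroring the observation that nifEN and nifB deletions yield an inactive enzyme.
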

The first relationship that was exploited 
was that between the structural genes nifDK,
which encode the MoFe protein (NifDK), and
the paralogous genes nifEN, which encode a
scaffold complex (NifEN) that functions in 
FeMo-cofactor biosynthesis. Primary amino 
acid sequence comparisons of NifD, NifK, NifE, 
and NifN reveal significant homology indicat- 
ing that these gene products evolved from a 
common ancestor (Brigle et al., 1987; Fani 
et al., 2000; Raymond et al., 2004; Soboh et al.,
2010; Boyd et al., 2011a). It has been suggested
that three independent gene duplications
yielded these four related gene products
(Fani et al., 2000). As depicted in Figure 3,
the first duplication, of an nflD-like
common ancestor, resulted in a gene
that over time diversified to form nifD. nifD
was then duplicated and through subsequent
diversification formed nifK, resulting in the heterotetrameric
MoFe protein, NifDK (Fani et al.,
2000; Raymond et al., 2004; Boyd et al., 2011a).
Subsequently, a bicistronic duplication of nifDK 
is thought to have yielded nifEN. Phylogenetic 
reconstructions of NifD, NifK, NifE, and NifN 
reveal that NifE and NifN sequences are nested 
FIGURE 3 | Hypothetical scheme depicting the evolution of nitrogenase from its protein ancestor. Parsimony
suggests that the likely ancestor of these protein complexes was a NflD-like protein present in an ancestral
methanogen. The movement of an ancestor of a NflD-like protein to anoxygenic phototrophs, and the diversification
of this protein into BchN, would necessitate lateral gene transfer followed by a duplication event. In contrast, vertical
inheritance of a duplicated NflD ancestor in a methanogen can account for proto NifD. The diversification of the
duplicated NflD-like ancestor into a proto homodimeric NifD (i.e., protonitrogenase) is presumed to have been
precipitated by interaction with an ancestor of the radical SAM protein NifB, which in extant biology catalyzes the
formation of the FeMo-co precursor, NifB-co, from simple FeS clusters. Here, NifB-co or the like could have
serendipitously been inserted in the open active site cavity presumed to be present in the protonitrogenase ancestor
(e.g., BchN- or NflD-like), conferring the ability to perhaps catalyze a low level of N2 reduction. A second duplication of
nifD and the subsequent diversification of this gene (loss of FeMo-co binding site) led to nifK. The later bicistronic
duplication of nifDK and subsequent diversification of these genes to nifEN yielded the ability to further mature
biosynthetic intermediates into FeMo-co. In this depiction, metal cofactor binding sites within proteins are indicated
by lobes whereas those that likely bind organic cofactors (e.g., protochlorophyllide) are indicated by circles. Open
lobes depict sites where a cluster similar to NifB-co may have been bound by a protonitrogenase.



Parsimony 

The concept of parsimony suggests that 
one should accept the simplest
explanation that fits the given evidence. 
In the case of molecular phylogenetics, 
this means that the explanation that 
requires the fewest assumptions and 
fewest evolutionary changes is the most 
likely. 
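
One standard way to count "fewest evolutionary changes" on a fixed tree is the Fitch small-parsimony procedure. The Python sketch below applies it to a single presence/absence character (1 = trait present, 0 = absent). It is a swapped-in textbook technique for illustration, not a method the authors report using; the tree shape, the taxon labels A to D, and the character states are all hypothetical, and the function assumes a strictly bifurcating rooted tree.

```python
def fitch(node, states):
    """Return (candidate ancestral states, minimum changes) for a subtree.

    `node` is a leaf label (str) or a 2-tuple of child nodes;
    `states` maps each leaf label to its observed state.
    """
    if isinstance(node, str):
        return {states[node]}, 0
    left, right = node
    left_set, left_changes = fitch(left, states)
    right_set, right_changes = fitch(right, states)
    common = left_set & right_set
    if common:
        # Children agree on at least one state: no extra change needed.
        return common, left_changes + right_changes
    # Children disagree: one change must occur on a branch below this node.
    return left_set | right_set, left_changes + right_changes + 1


# Hypothetical rooted tree with four taxa.
tree = ((("A", "B"), "C"), "D")

clustered = {"A": 1, "B": 1, "C": 0, "D": 0}  # trait confined to one clade
scattered = {"A": 1, "B": 0, "C": 1, "D": 0}  # trait patchily distributed

print(fitch(tree, clustered)[1])
print(fitch(tree, scattered)[1])
```

The clustered pattern needs a single change on this toy tree, while the scattered pattern needs two; under parsimony, the scenario requiring fewer changes is preferred.
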



among NifD and NifK sequences, respectively, 
consistent with this evolutionary trajectory 
(Boyd et al., 2011a). The most basal branch- 
ing NifD, NifK, NifE, and NifN sequences are 
associated with hydrogenotrophic methanogens 
(Methanobacteriales, Methanococcales), suggest- 
ing this physiological class of organism to be 
most representative of the ancestor of Nif. This 
finding was further supported by a phyloge- 
netic analysis of a concatenation of NifHDK, 
which revealed this class of organism as the 
basal branch (Figure 2) (Boyd et al., 2011b).
Importantly, the other early branching lineages
of nitrogenase all derive from obligate anaerobes
(Figure 2) (Boyd et al., 2011a,b), suggesting that
this process emerged in an anoxic environment, 
consistent with the oxygen sensitivity of this 
enzyme (Rubio and Ludden, 2008). Moreover, 
these findings are consistent with our sugges- 
tion that the mechanisms organisms evolved to 



fix nitrogen in the presence of oxygen are more 
recent evolutionary innovations. 

The evolutionary history of NifB rein- 
forces results implicating hydrogenotrophic 
methanogens as the oldest nitrogen-fixing 
organisms. NifB is a radical S-adenosylmethionine
(SAM)-dependent enzyme that is
involved in generating the hexacoordinated carbide
at the center of the FeMo-cofactor (Wiig
et al., 2012), the presumed key structural 
determinant of nitrogenase function. All Mo- 
nitrogenase systems identified to date encode 
for NifB, which is consistent with its essential 
role in the synthesis of the active site cofactor 
and with its presumed central role in the ori- 
gin of nitrogenase (Figure 3) (Soboh et al., 2010; 
Boyd et al., 2011a,b). Most NifB proteins associated
with extant organisms exist as two-domain
proteins composed of a radical SAM function- 
ality (SAM domain only) and a putative carrier 
Gene fusions

Gene fusions represent a hybrid of two 
or more previously separate 
(individual) genes. Gene fusions occur 
through a variety of random processes 
that often result in a selectable 
phenotype, which can render a more 
efficient biochemical process either in
space (co-localization of gene 
products) or time (both genes are 
under the same transcriptional 
regulation). In the case of the gene 
fusion between the radical SAM 
domain protein encoded by nifB and 
the carrier protein encoded by nifX, it is 
believed that this genetic event enabled 
co-localization facilitating more 
efficient synthesis and transfer of the 
FeMo-co precursor NifB-co to NifEN. 



protein functionality (NifX domain) resulting 
from a fusion of a gene encoding the core SAM 
domain and a standalone gene, nifX (Rubio 
and Ludden, 2008). Phylogenetic reconstruction
of just the SAM domain indicates that the
fusion of the gene encoding this domain with
nifX is a recent evolutionary innovation (Boyd
et al., 2011a). Methanogens, which harbor nifB
homologs that diverged prior to the fusion with
the nifX domain, branch at the base of the NifB-
SAM domain tree (Boyd et al., 2011a). The fact
that methanogens and other early descendants
on the NifB phylogeny (e.g., Firmicutes, Chloroflexi)
are strict anaerobes provides additional
support that Mo-nitrogenase had its origin in an
anoxic environment. 

ARE ALTERNATIVE NITROGENASES EVOLUTIONARY 
ANCESTORS OF MO-NITROGENASE? 

Although the majority of present-day biolog- 
ical N2 reduction is catalyzed by Nif (Rubio 
and Ludden, 2008), alternative forms exist with 
active site cofactors that lack Mo and contain 
vanadium and iron (V-nitrogenase) or iron only 
(Fe-only nitrogenase) (encoded by vnf and anf, 
respectively). The evolutionary trajectory of the 
different metal containing nitrogenases has been 
of keen interest since their discovery nearly 30 
years ago. It has been suggested that nitrogen 
fixation by Vnf or Anf might have preceded 
Nif prior to the Great Oxidation Event and the 
advent of oxygenic photosynthesis ~2.5-2.8 Ga 
(Anbar and Knoll, 2002; Raymond et al., 2004). 
This proposal was based on chemostratigraphic 
measurements that indicate limited Mo under 
the reducing environment of the early Earth 
(Anbar et al., 2007) where most Mo would have 
existed as complexes of insoluble Mo-sulfides 
(Helz et al., 1996). Although this logic is sound 
and the potential for alternative nitrogenases 
as ancestors of Mo-nitrogenase is a rational 
hypothesis, there are a number of observations 
that suggest that this is not likely to be the case. 

Alternative nitrogenases occur in a small 
number of organisms and to date have never 
been identified in taxa that do not also encode a 
Mo-nitrogenase (Boyd et al., 2011a). Moreover, 
gene clusters encoding the structural compo- 
nents of the alternative nitrogenases possess 
only a fraction of the cofactor biosynthetic genes 
required for FeMo-cofactor biosynthesis (Boyd 
et al., 2011a,b), implying a dependence on nif-encoded
biosynthetic machinery. Indeed, targeted
and global transcriptional analyses of the
model nitrogen-fixing organism, Azotobacter
vinelandii, indicate that the synthesis of an
active alternative nitrogenase (Vnf or Anf)
requires the expression of a number of FeMo-cofactor
synthetic gene products encoded by
genes in nif clusters (Wolfinger and Bishop,
1991; Hamilton et al., 2011b). Finally, extant
Mo-containing forms of nitrogenase are signifi- 
cantly more efficient at binding N2 and reducing 
it to ammonia than V- and Fe-only nitrogenase 
(Joerger and Bishop, 1988; Eady, 1996), and 
would have presumably been highly selected for 
under conditions of fixed N limitation that are 
thought to have characterized ecosystems at this 
time (Navarro-Gonzalez et al., 2001; Berman- 
Frank et al., 2003). These observations make 
it difficult to rationalize an ancestry whereby 
Mo-nitrogenase arose from an alternative nitro- 
genase without invoking extensive gene loss 
and/or significant genomic rearrangement. 

The gene clusters associated with alternative
nitrogenases encode only the structural
proteins (HDK) and generally lack homologs
of key biosynthetic genes (ENB). Exceptions
where biosynthetic genes are also encoded in
alternative gene clusters include the vnf operons
in A. vinelandii and Rhodopseudomonas palustris
CGA009, which encode EN, although
these copies are the result of a recent duplication
of nifEN in these taxa (Boyd et al.,
2011a). Extant alternative nitrogenase
operons therefore do not encode the complement
of genes required to independently synthesize
an active nitrogenase and do not meet
the criteria set forth above. Phylogenetic placement
of alternative nitrogenase in the evolution
of nitrogenase based on structural genes (D 
or K) has resulted in ambiguous results with 
respect to which form of nitrogenase is ances- 
tral (Boyd et al., 2011a). Likewise, phylogenetic 
analyses of Anf/Vnf/NifD alone or concatena- 
tions of Anf/Vnf/NifHD also lead to ambiguous 
results (Raymond et al., 2004). Since all three
nitrogenases (Mo-, V-, and Fe-only) have dedicated
structural proteins (H, D, and K), we
recently conducted a phylogenetic study of a 
concatenation of these three protein sequences 
(Boyd et al., 2011b). The well-resolved and 
strongly supported phylogenetic reconstruction 
(Figure 2) indicates that the alternative nitro- 
genases form a monophyletic lineage that is 
nested among Mo-nitrogenase, indicating that 
alternative nitrogenases are derived from Mo- 
nitrogenase (Boyd et al., 2011b). While the 
phylogenetic studies described in the chap- 
ter are limited by only being able to analyze 
sequenced extant organisms, they provide a 
compelling case for Mo-nitrogenase emerging 
prior to alternative nitrogenase. The branching 
order of alternative nitrogenase (i.e., the nesting 
of this lineage among strictly anaerobic taxa)
further suggests that the differentiation in metal 
usage in the nitrogenase isoforms is likely to 
have occurred in an anoxic environment. 
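
The concatenation step behind the H, D, and K analysis can be illustrated with a minimal sketch. It assumes that each structural protein has already been aligned separately and that taxon labels match across the three alignments. The function name, the taxon labels, and the toy sequences are hypothetical; they are not taken from Boyd et al. (2011b), whose alignment and tree-building procedures are not described here.

```python
from typing import Dict

SUBUNITS = ("H", "D", "K")  # structural proteins shared by all three nitrogenases


def concatenate_hdk(alignments: Dict[str, Dict[str, str]]) -> Dict[str, str]:
    """Join per-subunit alignments into one concatenated alignment.

    `alignments` maps a subunit name to {taxon: aligned sequence}.
    Only taxa present in all three alignments are kept.
    """
    for subunit in SUBUNITS:
        # Every row of a single alignment must have the same length.
        lengths = {len(seq) for seq in alignments[subunit].values()}
        if len(lengths) > 1:
            raise ValueError(f"alignment {subunit} has rows of unequal length")

    # Keep only taxa for which H, D, and K sequences are all available.
    shared = set.intersection(*(set(alignments[s]) for s in SUBUNITS))

    return {
        taxon: "".join(alignments[s][taxon] for s in SUBUNITS)
        for taxon in sorted(shared)
    }


# Hypothetical toy alignments (not real nitrogenase sequences).
toy = {
    "H": {"taxonA": "MA-K", "taxonB": "MAQK", "taxonC": "MSQK"},
    "D": {"taxonA": "TGE", "taxonB": "TGD"},
    "K": {"taxonA": "SQ", "taxonB": "SE", "taxonC": "SQ"},
}
supermatrix = concatenate_hdk(toy)
```

In this toy case taxonC is dropped because it lacks a D sequence, which mirrors the practical constraint that a concatenated analysis can only include organisms with all three structural proteins.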

WHAT IS THE NATURE OF THE METAL 
COMPLEMENT OF "UNCHARACTERIZED 
NITROGENASES"? 

The aforementioned phylogenetic studies delin- 
eate the metal composition of nitrogenase 
homologs by phylogenetic clustering with pro- 
teins from organisms for which their nitroge- 
nase has been characterized to varying extents. 
Recently, a number of nitrogenase homologs 
that form a deep branching monophyletic lin- 
eage [albeit still derived from Mo-nitrogenase 
(Boyd et al., 2011b)] have been identified 
(Figure 2). These nitrogenase homologs form a 
novel lineage that does not harbor representa- 
tive sequences for which biochemical informa- 
tion about the active site cluster exists (Boyd 
et al., 2011b; Dos Santos et al., 2012), precluding
assignment of the metallic composition of
their active site clusters (hence, "uncharacterized
nitrogenase"). Some of these uncharacterized
nitrogenase gene clusters do not meet our
established criteria (i.e., the requirement to encode
homologs of nifHDKENB) and instead comprise
only nifHDKEB (Boyd et al., 2011b).
Nonetheless, isotopic tracer experiments sug- 
gest that organisms harboring uncharacterized 
nitrogenase are capable of incorporating N2 into 
biomass (Mehta and Baross, 2006; Dekas et al., 
2009). 
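
The gene-complement criterion referred to above (homologs of nifHDKENB) can be expressed as a simple set comparison. The sketch below is illustrative only: the function names are hypothetical, and it assumes that homologs in a cluster have already been identified and reduced to single-letter gene labels, a step that in practice requires sequence-based homology searches not shown here.

```python
# Minimal gene complement required by the criteria described in the text.
REQUIRED_GENES = frozenset("HDKENB")


def missing_genes(cluster_genes):
    """Return the required nif gene letters absent from a cluster.

    `cluster_genes` is any iterable of single-letter gene labels,
    e.g. "HDKEB" or ["H", "D", "K"].
    """
    present = {gene.upper() for gene in cluster_genes}
    return REQUIRED_GENES - present


def meets_criteria(cluster_genes):
    """True if the cluster encodes homologs of all of H, D, K, E, N, and B."""
    return not missing_genes(cluster_genes)


# The uncharacterized clusters described in the text comprise only nifHDKEB.
uncharacterized_missing = missing_genes("HDKEB")
full_cluster_ok = meets_criteria("HDKENB")
```

Applied to a cluster comprising only nifHDKEB, the check reports nifN as the missing gene, which is the shortfall described for some of the uncharacterized nitrogenase clusters.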

Key differences in the active site cofactor 
protein environment of the different metal 
containing nitrogenases can be used to clas- 
sify the hypothetical metal composition of 
nitrogenase homologs that have not been char- 
acterized biochemically. As mentioned previ- 
ously, alternative nitrogenases are less efficient 
as N2 reduction catalysts and previous biochem- 
ical studies have also shown interesting differ- 
ences in the other catalytic properties (Eady, 
1996). Alternative nitrogenases produce a larger 
proportion of hydrogen as a product in the 
nitrogenase reaction when compared to Mo- 
nitrogenase. Acetylene reduction catalyzed by 
the Mo-nitrogenase results in ethylene as the 
sole product in contrast to the alternative nitro- 
genases that produce detectable quantities of 
ethane in addition to ethylene. The recent obser- 
vation that nitrogenases are capable of hydro- 
carbon production with carbon monoxide as 
a substrate (Lee et al., 2010; Hu and Ribbe, 
2008) indicates that Mo-nitrogenases and V-nitrogenases
have differing catalytic efficiencies
for hydrocarbon production, with V-nitrogenase
having higher catalytic rates. Interestingly, with
respect to these observations, it has been shown 
that simple site-specific amino acid substitutions
in Mo-dependent nitrogenase can effect
increased hydrocarbon production from carbon
dioxide on the order of that observed for the 
carbon monoxide dependent hydrocarbon pro- 
duction catalyzed by the V-dependent nitroge- 
nase (Yang et al., 2012). These results indicate 
that the combination of the metal content and 
cofactor protein environment that make up the 
structural determinants account for the subtle 
differences in substrate reduction properties of 
the different metal-dependent nitrogenases. 

Recently, we conducted a fairly exhaustive
study of the polypeptide environment of
the deeply rooted uncharacterized nitrogenases
based on homology models (McGlynn et al.,
2012). This work indicated that the
uncharacterized nitrogenases are more likely
to be Mo-dependent nitrogenases than V- or
Fe-dependent nitrogenases. Given that the organisms
that possess uncharacterized nitrogenases
occupy anaerobic niches, these findings are in
line with our previous studies indicating that
the oldest extant nitrogen-fixing organisms are
anaerobes and that biological nitrogen fixation
had its origins in an anoxic environment. In
addition, the observation that these more deeply
rooted uncharacterized nitrogenases are likely to
be Mo-dependent further supports our previous
observations indicating that Mo-dependent
nitrogenases are ancestral to the alternative
nitrogenases.

IS THERE EVOLUTIONARY RELEVANCE TO 
NITROGENASE PROMISCUITY? 

The ability of nitrogenase to reduce other substrates
such as acetylene and cyanide and to
convert carbon monoxide and carbon
dioxide to hydrocarbon products has enticed
some to propose that nitrogenase may have its
evolutionary roots in one or more of these catalytic
activities (Hu et al., 2011). This is an
interesting idea considering that the barriers for
these reactions are lower, which in turn might
afford a stepwise path to achieving an enzyme
capable of overcoming the enormous activation
barrier of N2 activation. However, homologs
of nitrogenase that have the dedicated func- 
tion of specifically reducing these types of sub- 
strates or analogous compounds in vivo have 
not been identified in extant biology and it 
is difficult to envision conditions that would 
have a strong selective pressure for such pro- 
cesses. The recent observation that Mo- and 
V-dependent nitrogenases are capable of hydrocarbon
production from carbon monoxide has
been used to suggest a link between nitroge- 
nase dependent production of reduced carbon 
and nitrogen (Hu et al., 2011). Although an 
intriguing idea, it is hard to imagine a selec- 
tive pressure that would select for such a com- 
plicated and ATP dependent mechanism for 
generating reduced carbon early during life's 
evolution, especially in light of the presence 
of other viable mechanisms of carbon reduc- 
tion (e.g., CO or CO2) (Ragsdale and Pierce, 
2008; Fuchs, 2011). The Wood-Ljungdahl pathway
is a prime example of a mechanism that
was likely to already exist when nitrogenase
evolved (Martin and Russell, 2007; Poehlein
et al., 2012; Nitschke et al., 2013), and the
key enzyme, the carbon monoxide dehydrogenase/acetyl-CoA
synthase, has
a much more common occurrence in deeply
rooted microorganisms than nitrogenase and
is by all accounts a more ancient enzyme. We
propose that (i) nitrogenase evolved in response
to the selective pressure of fixed nitrogen availability
and (ii) nitrogenase promiscuity and the
ability to reduce other substrates are a product
of evolving a redox enzyme that can overcome
the largest activation barrier in biology
(Rees, 1993).

WHAT IS THE EVOLUTIONARY ORIGIN OF 
NITROGENASE? 

The question of the ancestor of nitroge- 
nase is intriguing and is framed by numer- 
ous paradigms that are not strongly sup- 
ported by empirical observations gleaned from 
extant biology. In today's world, the availability
of fixed nitrogen limits global nutrition and
productivity (Falkowski, 1997); it is likely that
this anthropocentric focus leads to the
tendency to place the emergence of biological
nitrogen fixation as a very early event and perhaps
even a primordial process. However, even
the simplest evolutionary observations such as 
the aforementioned limited association of bio- 
logical nitrogen fixation in deeply rooted lin- 
eages are not consistent with this process being 
a property of LUCA. The history of nitrogen 
availability is not something that can, as of yet, 
be ascertained from the geologic record so the 
true time at which selective pressure was sufficient
to effect the emergence of such a complicated
biochemical process is unclear. Some
insights, however, can be assembled from evolu- 
tionary relationships of nitrogenase with other 
paralogous protein complexes associated with 
chlorophyll and bacteriochlorophyll biosynthesis
(Burke et al., 1993; Xiong et al., 2000; Brocker
et al., 2008) and the related enzyme complex
proposed to be involved in cofactor F430 biosynthesis
present in methanogens (Raymond et al.,
2004; Staples et al., 2007; Boyd et al., 2011b)
(Figures 3, 4).

There is an emerging body of work on 
the biochemistry of the dark operative pro- 
tochlorophyllide reductase complex involved 
in bacteriochlorophyll biosynthesis (Fujita
and Bauer, 2000; Brocker et al., 2008, 2010;
Sarma et al., 2008; Muraki et al., 2010). In
brief, the enzyme catalyzes the stereo-specific 
reduction of the C17-C18 double bond of 
the D-ring of protochlorophyllide to form 
chlorophyllide. Presumably the nature of this 
stereo-specific reduction is facilitated by a
gated electron transfer mechanism analogous to that
required in biological nitrogen fixation and thus




FIGURE 4 | Phylogenetic relationships between Anf/Vnf/NifD, Bch/ChlN, and NflD proteins, as reproduced
from Boyd et al. (2011b). Parsimony would suggest that the ancestor of this paralogous group of proteins likely
harbored an open active site cavity similar to that which is present in modern NflD or Chl/BchN proteins. The
ancestor of these protein complexes (at the trifurcation point of the tree) likely encoded a single structural protein
approximating NflD. A series of ancient duplications followed by independent evolution yielded the precursor to the
heterotetrameric BchNB and NifDK complex (see Figure 3 for a schematic outlining this evolutionary trajectory).



involves an analogous enzyme complex having
a homolog of the nitrogenase Fe protein,
BchL, and a component analogous to the MoFe
protein, BchNB. Whereas NifDK harbors a cavity
where FeMo-co binds, BchNB possesses
a cavity where protochlorophyllide binds and
where substrate reduction occurs (Brocker et al.,
2010; Muraki et al., 2010), the latter of which
involves BchL-dependent electron transfer reactions
to effect substrate reduction (Sarma et al.,
2008).

Relating the arguably simpler protochlorophyllide
reductase to the more complex
cofactor-containing nitrogenase from a structural
perspective, parsimony would suggest
that the simplest structure is the evolutionary
ancestor (Figure 3). That is to say, the simplest
evolutionary trajectory is one in which the common
ancestor approximates the structure of the
protochlorophyllide reductase (Brocker et al.,
2010; Muraki et al., 2010) or a cofactor-less
nitrogenase (Schmid et al., 2002). The mechanism
of cofactor biosynthesis is an additional
source of insight when thinking of plausible
scenarios for the evolution of nitrogenase. The
final step in nitrogenase enzyme maturation is
in fact the insertion of a preformed cofactor (on
NifEN) into a cofactor-less nitrogenase (NifDK)
(Rubio and Ludden, 2005, 2008; Hu and
Ribbe, 2008) that for all intents and purposes
approximates the salient structural features
of the protochlorophyllide reductase (BchNB)
(Figure 4). Interestingly, neither the cofactor-less
nitrogenase nor the cofactor on its own has
nitrogen reducing activity, observations that,
when considered in the context of the evolutionary
history of the genes required to synthesize
an active nitrogenase (see above), narrow down
viable scenarios. The most parsimonious scenario
involves a common ancestor with an open
cavity that resembles the protochlorophyllide
reductase serendipitously binding a modified
iron-sulfur cluster fragment, resulting in
a protonitrogenase with a small, albeit highly
selectable, level of nitrogen reducing activity
(Soboh et al., 2010; Boyd et al., 2011b). The
cluster fragment could be in the form of a radical
SAM-modified, carbide-containing iron-sulfur cluster similar
to the cofactor intermediate formed at an
early stage in FeMo-cofactor biosynthesis (e.g.,
NifB-co) (Wiig et al., 2012). In this scenario, the
nitrogenase would be under continual selective 
pressure to improve catalytic efficiencies, which 
would provide the selective impetus to adapt 
the active site cluster into FeMo-co through 
refinement (emergence of nifEN) of the compli- 
cated biosynthetic pathway observed in extant 
biology. 